Designing Semantic Kernels as Implicit Superconcept Expansions
Authors
Abstract
Recently, there has been increased interest in exploiting background knowledge in text mining tasks, especially text classification. At the same time, kernel-based learning algorithms such as Support Vector Machines have become a dominant paradigm in the text mining community. Among other reasons, this is due to their capability to achieve more accurate learning results by replacing the standard linear kernel (bag-of-words) with customized kernel functions that incorporate additional a priori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models when using the linear kernel.
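The idea of smoothing a bag-of-words kernel with superconcepts can be illustrated with a minimal sketch. It is not the authors' implementation: the toy taxonomy `SUPERCONCEPTS`, its weights, and all function names are hypothetical, and the expansion is materialized explicitly for readability, whereas the abstract describes an implicit expansion.

```python
from collections import Counter

# Hypothetical toy taxonomy: each term maps to (superconcept, weight) pairs.
# The weights stand in for a term-similarity measure and are made up.
SUPERCONCEPTS = {
    "beef": [("meat", 0.5)],
    "pork": [("meat", 0.5)],
    "apple": [("fruit", 0.5)],
}

def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())

def expand(bow):
    """Add weighted superconcept features to a bag-of-words vector."""
    expanded = {("term", term): float(count) for term, count in bow.items()}
    for term, count in bow.items():
        for concept, weight in SUPERCONCEPTS.get(term, []):
            key = ("concept", concept)
            expanded[key] = expanded.get(key, 0.0) + weight * count
    return expanded

def dot(u, v):
    """Sparse dot product over the keys shared by both vectors."""
    return sum(value * v[key] for key, value in u.items() if key in v)

def linear_kernel(a, b):
    """Standard bag-of-words kernel: counts only shared terms."""
    return dot(bag_of_words(a), bag_of_words(b))

def smoothed_kernel(a, b):
    """Dot product in the space extended with superconcept features."""
    return dot(expand(bag_of_words(a)), expand(bag_of_words(b)))
```

In this sketch, two documents that share no terms but do share a superconcept (for example "beef" and "pork") have a linear kernel value of zero and a positive smoothed kernel value, which is the effect that matters when the bag-of-words representation is sparse.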
Related resources
Efficient Linearization of Tree Kernel Functions
The combination of Support Vector Machines with very high dimensional kernels, such as string or tree kernels, suffers from two major drawbacks: first, the implicit representation of feature spaces does not allow us to understand which features actually triggered the generalization; second, the resulting computational burden may in some cases render unfeasible to use large data sets for trainin...
Conceptual Lexicon Using an Object-Oriented Language
This paper describes the construction of a lexicon representing abstract concepts. This lexicon is written in an object-oriented language, CTALK, and forms a dynamic network system controlled by object-oriented mechanisms. The content of the lexicon is constructed using a Japanese dictionary. First, entry words and their definition parts are derived from the dictionary. Second, syntactic and se...
Kernel Methods for Mining Instance Data in Ontologies
The number of ontologies and the amount of metadata available on the Web are constantly growing. The successful application of machine learning techniques for learning of ontologies from textual data, i.e. mining for the Semantic Web, contributes to this trend. However, no principled approaches exist so far for mining from the Semantic Web. We investigate how machine learning algorithms can be made amenable f...
Sub-exponentially Localized Kernels and Frames Induced by Orthogonal Expansions
The aim of this paper is to construct sub-exponentially localized kernels and frames in the context of classical orthogonal expansions, namely, expansions in Jacobi polynomials, spherical harmonics, orthogonal polynomials on the ball and simplex, and Hermite and Laguerre functions.
Multivariable Christoffel–Darboux Kernels and Characteristic Polynomials of Random Hermitian Matrices
We study multivariable Christoffel–Darboux kernels, which may be viewed as reproducing kernels for antisymmetric orthogonal polynomials, and also as correlation functions for products of characteristic polynomials of random Hermitian matrices. Using their interpretation as reproducing kernels, we obtain simple proofs of Pfaffian and determinant formulas, as well as Schur polynomial expansions, ...
Publication date: 2006